Abstract

This paper describes the design of a major development and research project within the UK’s 
Teaching and Learning Research Programme. It outlines the project’s aims to enhance teaching 
and learning in schools through innovative practice at classroom and school level, and through 
networking. It describes the assumptions and principles on which the design of the study was 
based and how the overall plan for the research evolved. This account is located within debates 
about research quality, which are current within both the USA and the UK, and specifically in 
discussions of issues surrounding research capacity building. It concludes that this complex 
intervention study is a variant of the ‘design experiment’ and offers opportunities for deepening 
research capacity by knowledge creation and dissemination both within and beyond the research 
team. 



[1] This paper is based on the work of ‘Learning How to Learn - in classrooms, schools and
networks’. This is a four-year development and research project funded from January 2001 to
March 2005 by the UK Economic and Social Research Council (ref: L139 25 1020) as part of
Phase II of the Teaching and Learning Research Programme (see http://www.tlrp.org ). The
Project is directed by Mary James (University of Cambridge) and co-directed by Robert
McCormick (Open University). Other members of the team are: Carmel Burgess, Patrick
Carmichael, David Frost, John MacBeath, David Pedder and Sue Swaffield (University of
Cambridge), Paul Black, Bethan Marshall and Joanna Swann (King's College London), Leslie
Honour and Richard Procter (University of Reading) and Alison Fox (Open University). Past
members of the team are Geoff Southworth, University of Reading (until March 2002), Colin
Conner, University of Cambridge (until April 2003) and Dylan Wiliam, King's College London
(until August 2003). Further details are available at:
http://www.learntolearn.ac.uk .
A Shared Concern but a Different Response

At this time, the United States and the United Kingdom are travelling together in some disputed 
territories. The special edition of Educational Researcher (volume 31, number 8, 2002) on 
‘Scientific research in education’ was careful to invite ‘dialogue’ and ‘conversation’ on what 
constitutes quality in educational research and to encourage a ‘productive’ debate. However, US 
federal policy[2], which seeks to mandate certain methods, namely experimentation in the form of
randomised controlled trials, as a condition for receipt of federal funds, is obviously deeply
worrying. Nothing similar has yet been proposed in the UK, although the issues that have
stimulated this response are familiar on our side of the Atlantic, and British educational
researchers are wary of what might be around the next corner. On both sides of the Atlantic, the
response of the research community seems to be broadly similar. Although there are those who 
vociferously oppose all grounds for increasing the level of government control, the majority are 
engaging with the issues that gave rise to these moves in order to find a position that might 
satisfy both the producers and the funders and consumers of research. 

The policy-maker’s position might be summed up by the comment of a superintendent of a school
district in England who said, ‘Look, I have £60 million to spend on education each year. I want
to spend it wisely and on the best evidence available to me. If researchers don’t provide it, I will 
base my decisions on my own best hunches because I have to spend the money anyway, or lose 
it’. His priority was not to ‘understand’ the experience of students and teachers per se, however 
interesting that might be, but to provide a service that would maximise students’ achievements. 
Although understanding is a legitimate and important goal for research, so is his concern for 
‘what works’ evidence, although the concept needs, rightly, to be problematised. Thus, on both 
sides of the Atlantic there has been vigorous debate about what kinds of research provide the best 
basis for ‘evidence-informed policy’. In the US, the debate seems to revolve around what counts 
as ‘scientifically-based research’. In the UK the focus is similar but extends to encompass, 
perhaps more fully, issues concerning the contribution of users to the research enterprise and 
how knowledge from research can be transformed in ways to achieve impact in communities of 
practice. 

So far, the response of the British Government to these debates, especially the criticisms of 
Hillage et al. (1998) and Tooley and Darby (1998) from within the academy, has not been to 
mandate particular kinds of educational research but to fund the largest ever programme of 
educational research in the UK: the Teaching and Learning Research Programme (TLRP). With 
£26m contributed over nine years (2000 to 2008) by the Higher Education Funding Council for 
England, the Department for Education and Skills, the Scottish Executive, the Welsh
Assembly and the former Northern Ireland Executive, the stakes are high and some have
described the initiative as ‘Last Chance Saloon’ - to borrow another metaphor from our 
American friends. Management by the Economic and Social Research Council is designed to 
provide a suitable degree of independence from direct government control although the projects,
networks and fellowships funded within the programme are expected to subscribe to a set of 
common aims: 



[2] See Goal 4 in http://www.ed.gov/pubs/stratplan2002-07/index.html



• to work to achieve significant improvements in learning outcomes for identified groups of 
learners; 

• to work in authentic settings of teaching and learning; 

• to bring multi-disciplinary or interdisciplinary approaches to research, and, where 
appropriate, involve practitioners, learners and other potential beneficiaries in the research 
process; 

• to enhance the capacity for a research-based approach to education and training practices; 

• to work in partnership with practitioners, learners, policy makers and others in the research 
community, to achieve maximum impact through transformation of the research results into 
actionable strategies and practices; 

• to make research-based contributions to the fundamental understanding of teaching and 
learning. 



Research Capacity Building 

The building of research capacity across the educational system, and within universities in
particular, has been considered sufficiently important and urgent to warrant the funding of a
dedicated Research Capacity Building Network (RCBN)[3] within the TLRP. A preparatory
review of research capacity in UK universities (McIntyre and McIntyre, 2000) indicated that 
many of those involved in research lacked expertise, that some skills do not exist in some 
institutions and regions, and that the expertise that does exist tends to be concentrated in 
qualitative approaches. As a consequence, small-scale local studies predominate and much
research is of poor or mediocre quality. These features of UK research are related to features of 
the (ageing) workforce in UK university departments of education. Many UK academics entered 
universities as teacher educators after an early successful career as school teachers. However, the 
Research Assessment Exercise, which is now an important basis for the allocation of funds to 
universities, has put pressure on all to aspire to become researchers of national or international 
standing. Although many academics hold PhDs, this is still not universally the case, and those 
who hold them have often achieved them after part-time study working on a small-scale project
of their own. This has predisposed many to adopt qualitative methodologies that they have then
taught to their own students. The situation is changing and ESRC-recognised courses of research
training now expect students to be taught the range of methodologies. However, we are still some
way from the kind of provision for faculty and graduate students enjoyed in the US. Some UK 
colleagues would argue that there is a trade-off here because what graduate students lack in 
methodological expertise is compensated by the credibility that ex-teacher researchers have 
within the communities they wish to research. This may also explain why the nature of the 
relationship between researchers and practitioners in the research enterprise is so high on the 
agenda of debate in our country. 

These issues of variability in the workforce, added to issues of diversity to do with the eclectic 
mix of disciplinary and subject backgrounds amongst those who do educational research, the 
multiplicity of value-perspectives on educational ends and means, and the variety of settings, 
curricula and forms of governance in education, make the integration of domains, cultures and 



[3] See http://www.cardiff.ac.uk/socsci/capacity



methods extremely difficult. David Berliner’s (2002) description of educational research as ‘the 
hardest science of all’ therefore seems apposite. This does not detract, however, from the need to
strive to improve the quality of what we do. The particular priorities have therefore been: 

• the development of skills in the design, conduct and management of quantitative studies, 
including experimental, quasi-experimental and survey techniques, capable of evaluating the 
effects of teaching and learning upon learners’ attainment across various contexts;

• enhancing the theoretical and conceptual bases for such studies; 

• the articulation/combination of qualitative approaches with quantitative studies; 

• the greater utilisation of interdisciplinary theories and methods; 

• the transformation of research-based knowledge through to its embodiment in practices
relevant to enhancing learner attainment. 

The TLRP has explicitly set out to tackle these issues, through the work of the RCBN, but also 
by expecting each project, of which there will be more than thirty after the launch of Phase III of 
the Programme in June 2003, to address the need to build research capacity as an aspect of its 
work. The central questions they need to address are set out as follows[4]:

• Recording changes in learning outcomes. How can appropriate and robust evidence be 
gathered on changes in the wide range of learning outcomes of interest to TLRP? 

• Quality of educational research. The critique of recent years has challenged researchers in 
terms of validity, reliability, warrants, causality, etc. and many other issues. Responses 
have reflected the diversity of the paradigms from which educational researchers draw. 
How far is this satisfactory? To what extent can a reconciliation be achieved? 

• Developing research capacity. The aspiration is that TLRP should contribute to 
broadening the expertise available for educational research in the UK. What can we learn 
from our experience so far, and how can we move forward in the future? 

In the remainder of this paper we discuss some of the ways that our project, within Phase II of 
the TLRP, is trying to provide some answers to these questions. 



The Learning How to Learn Project: the origins of an idea 

The ‘Learning How to Learn - in classrooms, schools and networks’ (L2L) Project is an attempt 
to design a collaborative development and research initiative with the twin goals of achieving 
high relevance and high scientific quality[5]. In essence this was an expectation placed upon the
project through participation in the TLRP for the reasons given above. It is a four-year project
(Jan 2001-March 2005) involving four universities (Cambridge University, King’s College 
London, the Open University, and Reading University) working in partnership with 43 
elementary and high schools in seven school districts (Essex, Hertfordshire, Kent, Medway, 
Oxfordshire, Redbridge and Somerset). The idea for the project came from two sources, both 



[4] See http://www.tlrp.org/themes/index.html

[5] ‘Scientific’ is understood here in a broad sense, not narrowly restricted to experiments, but nevertheless striving to
meet the kind of criteria set out by the US National Research Council Committee (NRC, 2002).
intending to build on previous work and research expertise of the researchers in the different
institutions. 

The first source was work on assessment for learning. James (Cambridge) was convenor of the 
UK’s Assessment Reform Group[6] (ARG) at the time when it commissioned from Black and
Wiliam (King’s London) a review of research on the impact of assessment on classroom learning,
in order to update aspects of an earlier review by Crooks (1988). This new review (Black and
Wiliam, 1998a), which paid particular attention to fairly carefully controlled, small-scale,
focused experiments, reported that there was conclusive evidence that formative assessment 
(now more familiarly referred to in the UK as ‘assessment for learning’) improves learning and 
attainment. Indeed, the gains in attainment appeared to be quite considerable and, with effect 
sizes from 0.4 to 0.7, amongst the largest ever reported for educational interventions. The 
dissemination of this review through publication in a teacher-friendly pamphlet (Black and 
Wiliam, 1998b), talks given to teachers by researchers, and follow-up publications from the 
Assessment Reform Group (ARG, 1999, 2002), contributed to an unusual level of take-up of
ideas by practitioners and policy groups. For example, the UK Government’s recommended
strategy for teachers of 11-14 year old students (ref: DfES 0350/2002) includes a module on
‘assessment for learning in everyday lessons’ that derives from this work. The Qualifications and
Curriculum Authority has also dedicated a section of its website to this theme[7].
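
The effect sizes quoted above are standardised mean differences between groups. As a purely illustrative sketch (the review is not quoted here as prescribing a formula, so the pooled-standard-deviation form usually called Cohen's d, the function name and the invented test scores below are all assumptions), an effect size of roughly 0.4 could arise as follows:

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardised mean difference using a pooled standard deviation.

    Assumption: this is one common definition of 'effect size'; the
    studies discussed in the text may have used variants of it.
    """
    n_t, n_c = len(treatment), len(control)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_t - 1) * variance(treatment) + (n_c - 1) * variance(control)
    ) / (n_t + n_c - 2)
    return (mean(treatment) - mean(control)) / sqrt(pooled_var)


# Hypothetical scores, invented for illustration only.
treatment = [52, 55, 58, 61, 64]  # class using formative assessment
control = [50, 53, 56, 59, 62]    # comparison class

d = cohens_d(treatment, control)
print(round(d, 2))
```

Here a two-point difference in mean score, set against a spread of just under five points within each class, yields an effect size of a little over 0.4, which is the lower end of the range reported in the review.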

Black and Wiliam subsequently undertook a development project involving 24 math and science 
teachers in six high schools in two school districts. This concentrated on promoting new 
practices in four aspects of effective assessment for learning: developing classroom dialogue, 
especially questioning; using narrative comments, rather than scores, in feedback on students’ 
work; peer- and self-assessment; and the formative use of summative tests. The results of this
work indicate that teachers’ and students’ expectations, and classroom culture, can be changed 
and that these changes are associated with gains in attainment (in this case the average effect size 
was around 0.3). The researchers claim that, ‘Such improvements, produced across a school, 
would raise a school in the lower quartile of the national performance tables to well above 
average’ (Black et al., 2002:4). Here’s the rub! How does one spread knowledge and promote
changes in practice across teachers and schools? Moreover, how can one achieve ‘leverage’ 
using minimum resource for maximum impact? These questions engage with a second field of 
educational research, associated with school improvement and, more generally, with educational 
change. 

Colleagues at Cambridge University have worked in the field of teacher professional 
development and school improvement for many years, both as educators and researchers (see, for 
example, Bradley, Conner and Southworth, 1994; Frost et al., 2000; James, 1998; MacBeath and 
Mortimore, 2001). Much has been learned about ways to create the conditions for school 
improvement and there has been a discernible shift from an earlier emphasis on school structures 
and staff roles and responsibilities, which are at one remove from the primary processes of 
schooling, towards a more explicit focus on conditions and processes that support effective 
teaching and learning. For example, MacBeath and Mortimore (2001:21) argue that critical for 
school effectiveness and school improvement is a better understanding of the when and the how 



[6] See http://www.assessment-reform-group.org.uk
[7] See http://www.qca.org.uk/ca/5-14/afl/
of learning. This may seem to be stating the obvious but it is remarkable how infrequently school
improvement researchers have engaged with issues of pedagogy (and vice versa). These two 
fields of classroom-level research and school-level research are vast in themselves and bringing 
them into alignment may seem too difficult. However, unless some attempt is made to 
understand what effective learning looks like, what teaching practices promote this, and what 
kinds of professional development and institutional conditions and cultures help teachers to learn
new practices, we have little chance of spreading and sustaining the kinds of improvements 
observed in the studies reviewed by Black and Wiliam. 

The L2L Project was therefore set up to respond to this challenge: to take forward the classroom 
work begun by Black and Wiliam and link it with the school improvement work of Cambridge 
colleagues. Moreover, in anticipation of schools becoming only one site for learning in the 
electronic age, as the creation of dispersed but networked learning communities increases, the 
research interests of other colleagues from Reading University and the Open University, who are 
especially interested in electronic media for networking, were drawn in. This enabled a
three-level project to be designed which would seek to investigate what learning how to learn might
mean for teachers and students in classrooms, in relation to organisational learning, and for
learning across networks of teachers and schools. Specifically, the project aims to advance both
understanding and practice of learning how to learn in classrooms (level 1), schools (level 2) and
networks (level 3) and to: 

• develop and extend recent work on ‘assessment for learning’ into a model of learning how to
learn for both teachers and pupils;

• investigate what teachers can do to help pupils to learn how to learn;

• investigate what characterises the school in which teachers successfully create and manage
the knowledge and skills of learning how to learn;

• investigate how educational networks, including electronic networks, can support the
creation, management and transfer of the knowledge and skills of learning how to learn;

• attempt to develop a generic model of innovation in teaching and learning that integrates 
work in classrooms, schools and networks. 



Design assumptions and principles 

The project design was based on the following theoretical, methodological or practical 
assumptions. 

• The concepts and practices underpinning ‘assessment for learning’ cohere theoretically 
with concepts and practices associated with ‘learning how to learn’; thus work to
promote AfL is expected to enhance L2L although this hypothesis needs to be tested and 
the relationships properly examined. 

• The development of new AfL/L2L practices needs to be stimulated through some form of
intervention because previous research indicates that they are not widespread in 
teachers’ current practice. 
• Intensive intervention by university people cannot be sustained in the long term and
across whole systems; therefore the intervention needs to be designed in accordance with
a principle of parsimony.

• Adaptation of the intervention is likely, and to be encouraged[8], because the project has
an interest in discovering and recording how teachers create new knowledge as they try
out, and adapt, practices in their particular contexts, i.e. in their schools, with their
pupils, teaching an aspect of their subject.

• Development and research have an iterative relationship because the research design 
needs to be sensitive to these adaptations in context in order to capture them. 

• In order to test portability of ideas and practices, implementation needs to be 
encouraged in a sufficient range of school contexts - urban/rural, large/small, mono- 
ethnic/multi-ethnic, elementary/high schools, and across different subject domains. 

• Supporting development work in a large sample of schools (n = 43), even on a
parsimonious scale (with an allocation of two days of an academic’s time per school per
year), requires a large team, especially when this is added to research tasks.

• A large project team provides an opportunity for the creation of a multidisciplinary 
group who learn from one another and deepen their research capacity even if none can 
claim to be a “compleat researcher” (Gorard, 2002). This acknowledges that in any 
community, knowledge is distributed in ways that are functional to its survival. 

• Implementation of project ideas over time needs to be monitored and described, 
requiring thick descriptions of practice in classrooms, of aspects of school policy, and of 
wider contextual influences that impact on classroom practice. 

• Qualitative data need to be combined with quantitative data so that comparisons across 
schools and over time, in relation to important features, can be facilitated. 

• Data need to be collected about defined outcomes, processes and contexts and the 
relationships among these explored. 

• Explanation needs to address the issue of causality although association should not be 
taken to imply a causal relationship, nor the direction of causality. 

• The activity of academics and researchers, and the differences among them in their 
approach, need to be taken into account through reflexivity. 

• A theorised audit trail approach offers the best strategy for generating plausible 
explanations, linking processes to outcomes, in the complex setting of this project. 

• Plausible rival explanations should also be deliberately sought to strengthen the 
scientific merit of project findings. 

• In a climate of increasing work intensification, participating schools and teachers need 
to be convinced that there are benefits to be gained from their collaboration in the 
project. These do not need to be in the form of assurances that test scores will rise but 
they do need to be convinced that their professional needs are recognised and supported, 
and that the work will be useful to them. 



[8] We wanted, of course, to avoid ‘lethal mutations’ (Brown, 1992). Our characterisation of the support from
academics as ‘critical friendship’ to schools provided a means of moderating teachers’ adaptations by reference to
principles derived from previous research. For example, teachers who wanted to interpret ‘sharing criteria of quality 
with students’ as encouragement to distribute model answers for reproduction in examinations, could be challenged 
to consider the importance of a learning orientation, rather than a performance orientation, if students are to learn 
how to learn. 
Principles for research design can be extracted from the assumptions outlined above, namely:
collaboration, intervention, parsimony, utility, relevance, adaptability, iteration, portability, 
sustainability, scalability, generalisability, plausibility, theoretical coherence, 
multidisciplinarity, reflexivity and criticality. 



Project Plan 

In so far as all projects within the TLRP have a principal aim to research ways of enhancing 
outcomes for learners[9], we needed to decide what outcomes are of interest and how we might
measure them. We accepted that one measure would need to be students’ scholastic attainment as
currently measured in schools through nationally prescribed key stage tests and examinations (at
ages 7, 11, 14 and 16). Participation and engagement as indicated by school attendance records
are also relevant. However, our interest in learning how to learn as an outcome, as well as a 
process, led us to develop a dynamic assessment instrument. Although the nature of this 
instrument prevents us from using it on a wide sample in the current study, we will examine the 
relationship between some students’ performance on this measure and the standard measures of 
attainment that most children in England now undergo. We make the assumption that these 
outcomes are, at least in part, the result of classroom interactions with teachers, peers and other 
tools and artefacts, and that these classroom interactions and practices are influenced by 
teachers’ and students’ beliefs about learning. In the same way, we hypothesise that teachers’ 
and students’ individual and collective beliefs and practices are themselves ‘outcomes’ and, at 
least in part, the result of their experience of professional development, school culture,
management practices and networking opportunities both within and across schools. Through
this process of backward-mapping we have constructed a ‘causal model’ with process and
outcome links that we intend to investigate (see Figure 1).

Figure 2 provides more detail of the data we are collecting and the specific role these elements of 
data are expected to have in illuminating the impact of interventions on practice and outcomes. 
Thus we make a distinction between those data that are measures of antecedents and outcomes, 
and those that are principally indicators of mediating or contextual variables. In general, the 
former, derived from survey or test instruments, will render longitudinal (pre- and post- 
development) quantitative information both within and across schools. The latter, drawing on
mainly narrative sources, will provide data that will be used to investigate the nature of links
indicated by correlations. We intend to place most emphasis on comparisons between cases (e.g. 
different schools with different rates and patterns of development) but it may be possible to make 
some use of control schools, which have no contact with the project, by drawing a matched 
sample from a large national data-set available to Cambridge researchers. Case comparisons will 
be important because the ‘intervention’ is not tightly controlled. This has been quite deliberate 
because we are particularly interested in how schools and teachers take up, adapt and implement 
project ideas and what conditions promote effective practice and enhanced outcomes. 

In Figure 2, we also distinguish between the locus and the focus of the intervention of the project 
team. Our primary intervention is in the form of an initial in-service session for a school or group 



[9] In our project, learners are construed as both students and teachers, and, at the risk of reification, even schools as
organisations might be said to learn. 
of schools, followed by audit of practice and action-planning. The focus of this session is
presentation and discussion of research and ideas for encouraging learning how to learn through
assessment for learning[10]. What happens next depends to an extent on schools’ own analysis of
their needs although follow-up workshop activities, on questioning, feedback, sharing criteria of 
quality, and peer- and self-assessment, were made available. These all focus on classroom 
practice. 

The major intervention with a focus on school policy and practices is feedback of the results of
the staff questionnaire. This questionnaire has a dual purpose[11]. On the one hand it is designed to
provide the research team with data on staff perceptions and values in relation to school
assessment practice, professional learning, and management systems and practices. On the other
hand, results can be used as diagnostic tools for school self-evaluation. Further materials for
follow-up activities are made available to schools to help them respond to the issues for
school-level policy and practice that this instrument reveals.

One of the principles underpinning the project strategy is that research instruments should be 
useful to teachers and schools as well as useful to researchers. Thus there is an intention that all 
instrumentation should be in a form that can be ‘left behind’ for teachers to use, alongside more 
familiar forms of in-service resources. In order to manage this, the role of the project website has 
developed in the expectation that it will become an expanding resource of materials, instruments 
and examples of practice that teachers can use and exchange. For this reason, the website is 
itself a kind of intervention, which principally operates at network level, having a role in 
knowledge creation and exchange between teachers and schools. 

Some of the parts of this plan were described in the project proposal but much of the detail has 
become clear only as the project has evolved in the iteration between development and research, 
and in the attempt to balance what might be methodologically desirable with what is practically 
feasible and manageable. We see this as appropriate, if not inevitable. As we arrive at the half- 
way mark in our time-scale we are reasonably confident that we have all the components for 
development and data collection in place, so we can turn more of our attention to analysis and 
interpretation. Our instrumentation was developed on the basis of existing theory, and the 
‘substantive’ theory implicit in our causal model, which we intend to test and modify. But we 
also hope to build some new theory and to examine how our findings might illuminate, or be 
illuminated by, ‘formal’ theories such as activity theory (Engeström, 1987), distributed cognitions
theory (Salomon, 1993), or communities of practice theories (Wenger, 1998). The final right- 
hand column of our plan (Figure 2) records some of the theoretical resources we expect to bring 
to this task. Given the fact that the team is made up of academics and researchers with different 
backgrounds (ranging from physics to English, and philosophy to sociology) we decided not to 
locate this study in a single, narrow, theoretical framework but to create a project which would 
enable us to test a number of analytical perspectives. We believe that the phenomena we are 



[10] Development materials and other project resources are publicly available on our project website:
http://www.learntolearn.ac.uk

[11] A paper, ‘A Servant of Two Masters: designing research to advance knowledge and practice’, provides more detail
of the development and use of this instrument. This is given at this AERA meeting in the symposium, ‘Talking,
working and learning with teachers and school leaders: the Cambridge Symposium’.



interested in can be viewed through a number of lenses that each can enrich and deepen our 
vision. 



What kind of study is this? 

We began work on the proposal for this project on 18th November 1999[12]. We were bidding to be
part of the TLRP, which was set up specifically to respond to public concerns about educational 
research quality, so we were cognisant of the debates surrounding research design, including 
those in the US. However, we never explicitly described our proposal as an attempt to create a 
project in any particular design category, although we did mention action research as a broad, 
general category. For the reasons stated above, we knew it had to involve an intervention, and, 
since we were required to show outcomes, we were obliged to consider how these should be 
defined and measured, what they might be outcomes of, and how we would know - hence our 
causal model. However, from this point we proceeded almost entirely pragmatically (a very 
British characteristic) and developed and honed the project design as we encountered new issues. 
It is only now, after two years, that we can see the scope of the whole project properly, although 
we still expect to have to make some adjustments before we finish. This kind of iterative design 
would not be acceptable in a laboratory experiment, but we were committed to working in the 
field, and the field gives rise to the unexpected which must be taken into account. So, if this 
project is not an experiment, in the formal sense, what is it? 

In an Occasional Paper from the RCBN, Gorard (2002) sets out four formal models for 
combining qualitative and quantitative methods: 

1. a ‘Bayesian’ model in which the findings of high quality studies (mostly large-scale 
randomised controlled trials) are ‘engineered’ into practice; 

2. a ‘new political arithmetic’ model in which a first (descriptive) phase of problem-definition 
through large-scale analysis is followed by a second (explanatory) phase of detailed 
examination of a sub-set of cases; 

3. a model of ‘complex interventions’ in which a first phase involves the initial design of an 
intervention based on existing theory, with explicit causal mechanisms of proposed effects, 
followed by formative evaluation of implementation using qualitative approaches, then a 
third phase comprising a full feasibility study with both in-depth feedback and 
quantitative measurement with alternative treatments or controls; 

4. ‘design experiments’, which derive from ideas associated with engineering (Brown 1992) 
and are concerned with how artefacts behave under different conditions; implementation 
in realistic settings is therefore key, and use is monitored both numerically and non-numerically. 

Neither the first nor the second of these provides a good description of our study, but the third and 
fourth warrant attention. Clearly, we have a ‘complex intervention’ which was designed on the 
basis of existing theory, and our ‘causal chain’ provides hypotheses about how interventions 
might lead to effects. Similarly, our approach is pragmatic and ‘sacrifices standardisation for 
realism, and means that the natural variability in delivery that occurs between practitioners must 
be recorded and monitored by in-depth means as well as the more traditional measures’ (Gorard, 
2002:16). However, the source of this model is health service interventions where the goal is to 
obtain estimates of the average effect of the intervention with useful further information on the 
external factors that support or attenuate this effect (Moore, 2002:5). Thus the randomised 
controlled trial (RCT) is the basis for these studies. The problem with this for us is that the 
treatment intervention is regarded statically and the variations of interest are mainly variations in 
the quality of the delivery. McCulloch et al. (2002) point out that this notion is problematic even 
in the health care context. RCTs work well in the evaluation of modern drug therapies but less 
well in areas such as surgery, where definitions are not precise and treatments may overlap, 
where performance improves over time so that the first treatment may be inferior to a later 
one, and where patients often reject RCTs because they do not want treatment to be decided by 
chance. All these aspects have parallels in the context of innovations in teaching, and similar 
problems arise. 

It was more than a year later that we were informed that we had succeeded in the ESRC competition and could 
start work. 

‘Design studies’ according to those who object to this appropriation of the term from natural science. 

L2L is interested in variations in implementation but we also hold a dynamic view of 
interventions and do not regard them as artefacts that can be definitively ‘fixed’. So, the notion 
of design experiments appeals because they are recognised to be ‘messier, taking place in real 
settings, monitoring many variables, characterising the situation ethnographically, revising the 
procedures at will, allowing participants to interact, developing profiles rather than hypotheses, 
and involving users and practitioners in the design’ (Gorard, 2002:17). In this model the artefact, 
rather like an engineered ‘gadget’, can be modified and improved through continuous tinkering. 
This metaphor has some resonance with what teachers do in classrooms, as Huberman (1992) 
observed: 

Essentially teachers are artisans working primarily alone, with a variety of new and 
cobbled together materials, in a personally designed work environment. They gradually 
develop a repertoire of instructional skills and strategies, corresponding to a progressively 
denser, more differentiated and well integrated set of mental schemata: they come to read 
the instructional situation better and faster, and to respond with a greater variety of tools. 
They develop this repertoire through a somewhat haphazard process of trial and error, 
usually when one or other segment of the repertoire does not work repeatedly... When 
things go well, when routines work smoothly ... there is a rush of craft pride... When things 
do not go well... cycles of experimentation ...are intensified... Teachers spontaneously go 
about tinkering with their classrooms. (cited in OECD, 2000, chap. 3, para. 12) 

Allowing variation both in the context of implementation and in the nature of the artefact invites 
research approaches that incline towards documentation of the process of modification over time. 
Indeed, this is a criticism of design experiments as currently enacted. For example, Shavelson et 
al. (2003) claim that design studies, as they prefer to call them, ‘often rely on narrative accounts 
as data for modifying theory and the design of artefacts iteratively over time’ (p.27). Recalling 
Brown’s (1992) original formulation that the design and testing of an innovation should iterate 
between classroom and laboratory, and quoting with approval the examples described by 
McCandliss, Kalchman and Bryant (2003), they argue that generalisable conclusions can only be 
warranted if other methods, such as traditional experiments, are integrated with design studies in 
order to capitalise on the respective strengths of both. 
We have some sympathy with this line of argument but given the complexity of the innovations 
we are interested in promoting and the fact that they concern the structures of professional 
development and the cultures of schools, as well as changes in teachers’ classroom practices, we 
cannot see how we could create a robust experimental design, especially an RCT. Since one of 
our key units is the school, the idea of randomising treatments over large numbers of schools is 
simply not viable. 

All is not lost, however, because, as Gorard (2002) points out, in order to evaluate the 
innovations, ‘the outcome(s) of interest must be fixed first, else, if it is modified along with the 
intervention during the study, there is no fixed point to research’ (p.18). In other words, even if 
the content, processes and context of the innovation change, the design of the study can be 
robust if the definition and measurement of outcomes remain reasonably stable. This reflects the 
design of L2L, where a set of antecedent/outcome measures has been developed to measure 
change across settings and over time, whilst other qualitative or quasi-quantitative measures will 
be used to document features of change that may have explanatory power. This is an integrated, 
not a phased, design, however. 

Where we differ from Gorard is over his assertion that ‘it was never the intention that the same 
dataset would be used to both modify and test the artefact/intervention’ (2002, p.18). We are 
explicitly using the results from baseline data collection at school level (our staff questionnaire) 
as an intervention at Level 2. Whilst the research team will use these data as a summative 
measure of change, schools can use these data for formative self-evaluation. Thus the first 
administration of this instrument can be expected to have an impact on what subsequently 
happens in schools. We see no problem here as long as this is taken into account when the 
instrument is again administered at the end of the project and the comparative data analysed. 
Moreover, this practice ensures continued co-operation from schools which, in the UK, are 
increasingly reluctant to engage with research that feeds into learned journals and advances 
researchers’ careers but does not benefit them. If teachers give their time they understandably 
want feedback in return. 



Conclusion 

Apparently, the Design-Based Research Collective in the US was forming (Kelly 2003) as we 
were writing our proposal and we were not aware of it at the time. However, now that the design 
experiment, as an approach to educational research in complex settings, has been more fully 
explored (see the special issue of Educational Researcher, 32(1) January/February 2003), the 
resonance between these ideas and our own attempts to construct a project that will be both 
scientifically rigorous and practically useful is remarkable. We would claim that our project has all the 
characteristics of design experiments described by Cobb et al. (2003). It is iterative, 
process-focussed, interventionist, collaborative, multi-levelled, utility-oriented and theory-driven. Our 
additional principles (see above) suggest that other characteristics may also need to be 
considered. 
Our experience suggests that, through combining qualitative and quantitative approaches, it is 
possible to integrate narratives of the evolving design process with more traditional studies 
comparing outcomes, with or without formal controls, to warrant a causal claim. Of course, we 
are still a long way from completing our project so whether we will be successful in this remains 
to be seen. The methodological issues are important and difficult but some of the major issues 
we face at the present time have more to do with managing such complex projects. 

We have posed ourselves some big questions to answer, but even if we had been more modest in 
our aspirations we suspect we would have faced similar problems. Simultaneously developing 
interventions, developing research instruments of different kinds, recruiting and supporting work 
in schools and districts, collecting and analysing data, documenting modifications, developing 
theoretical resources, maintaining dialogue with user groups etc. is like juggling. Early on, with 
attention to methodological as well as practical considerations, we decided on a division of labour, 
with academics providing in-service support and critical friendship to schools and our two 
full-time researchers and one part-time researcher concentrating on instrument development and data 
collection. Similarly, the different universities are responsible for work in different school 
districts (i.e. Cambridge for Essex and Hertfordshire, King’s for Oxfordshire and Medway, and 
Reading and the OU for Redbridge, Kent and Somerset) and leading the research at one level 
(i.e. King’s for research in classrooms, Cambridge for school-level research, and Reading/OU for 
network-level research). The whole team is, however, involved in analysis and theoretical 
development and testing. 

It would be easy for a project such as this to fragment into sub-projects. We did not want this to 
happen because the validity of our findings depends on our ability to present an integrated picture. 
Sometimes this has not seemed efficient because it is necessary to have regular whole team 
meetings to keep all seventeen members of the team travelling in the same direction together. 
Despite the frustrations that this can engender, we believe that this is essentially the way a 
project such as this can contribute to research capacity building. We learn by listening to one 
another as we struggle to solve problems and share our different forms of expertise. Sometimes 
the learning curve is very steep and not always smooth. Nevertheless, if each of us goes away and 
shares our new knowledge and experience with others in our institutions, and in other projects, 
the growth in capacity could be exponential. This networking approach also has neat parallels 
with the approach to knowledge creation and exchange in schools that is the substantive focus of 
our project. Building and deepening system capacity is, after all, the name of the game. 
FIGURE 1: L2L ‘CAUSAL’ MODEL 
Figure 2: L2L Development and Research Plan 